Design Of A Lexical Database For Sanskrit
نویسنده
چکیده
We present the architectural design rationale of a Sanskrit computational linguistics platform, where the lexical database has a central role. We explain the structuring requirements issued from the interlinking of grammatical tools through its hypertext rendition.
منابع مشابه
An Effort to Develop a Tagged Lexical Resource for Sanskrit
In this paper we present our efforts the first time of its kind in the history of Sanskrit to design and develop a structured electronic lexical Resource by tagging a Traditional Sanskrit dictionary. We narrate how the whole unstructured raw text of Vaacaspatyam – an encyclopedic type of Sanskrit Dictionary has been tagged to form a user friendly e-lexicon with structured and segregated informa...
متن کاملSanskritTagger: A Stochastic Lexical and POS Tagger for Sanskrit
SanskritTagger is a stochastic tagger for unpreprocessed Sanskrit text. The tagger tokenises text with a Markov model and performs part-of-speech tagging with a Hidden Markov model. Parameters for these processes are estimated from a manually annotated corpus of currently about 1.500.000 words. The article sketches the tagging process, reports the results of tagging a few short passages of Sans...
متن کاملCompound Type Identification in Sanskrit: What Roles do the Corpus and Grammar Play?
We propose a classification framework for semantic type identification of compounds in Sanskrit. We broadly classify the compounds into four different classes namely, Avyayı̄bhāva, Tatpurus.a, Bahuvrı̄hi and Dvandva. Our classification is based on the traditional classification system as mentioned in the ancient grammar treatise As. t .ādhyāyı̄ by Pān. ini, written 25 centuries back. We construct ...
متن کاملDesign and Implementation of an Intelligent Part of Speech Generator
The aim of this paper is to report on an attempt to design and implement an intelligent system capable of generating the correct part of speech for a given sentence while the sentence is totally new to the system and not stored in any database available to the system. It follows the same steps a normal individual does to provide the correct parts of speech using a natural language processor. It...
متن کاملAn Approach for Grammatical Constructs of Sanskrit Language using Morpheme and Parts- of-Speech Tagging by Sanskrit Corpus
Sanskrit since many thousands of years has been the oriental language of India. It is the base for most of the Indian Languages. Statistical processing of Natural Language is based on corpora (singular corpus). Collection of texts of the written and spoken words is known as Language corpus, which is collected in an organized way, in electronic media for the purpose of linguistic research. It pr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004